Cross-Language Information Retrieval with Incorrect Query Translations
نویسندگان
چکیده
In this paper, we present a Cross Language Information Retrieval (CLIR) approach using corpus driven query suggestion. We have used corpus statistics to gather a clue on selecting the right query terms when the translation of a specific query is missing or incorrect. The derived set of queries are ranked to select the top ranked queries. These top ranked queries are further used to perform query formulation. Using the re-formulated weighted query, we perform cross language information retrieval. The results are compared with the results of CLIR system with Google translation of user queries and CLIR with the proposed query suggestion approach. We have English and Tamil corpus of FIRE 2012 dataset and analyzed the effects of the proposed approach. The experimental results show that the proposed approach performs well with the incorrect translation of the queries.
منابع مشابه
Using co-occurrence tendencies to improve Cross-Language Information Retrieval
Query disambiguation is considered as one of the most important methods in improving the effectiveness of information retrieval. In the present paper, we focus on query terms disambiguation via, a combined statistical method both before and after translation, in order to avoid source language ambiguity as well as incorrect selection of target translations. By combining query expansion with dict...
متن کاملAdapting SMT Query Translation Reranker to New Languages in Cross-Lingual Information Retrieval
We investigate adaptation of a supervised machine learning model for reranking of query translations to new languages in the context of cross-lingual information retrieval. The model is trained to rerank multiple translations produced by a statistical machine translation system and optimize retrieval quality. The model features do not depend on the source language and thus allow the model to be...
متن کاملQuery Translation for Cross-lingual Information Retrieval using Wikipedia
In this paper the system WikiTranslate is introduced that performs query translation for cross-lingual information retrieval (CLIR) that only uses Wikipedia. Queries will be mapped to Wikipedia concepts and the corresponding translations of these concepts in the target language are used to create the final query. WikiTranslate is evaluated by searching with topics in Dutch, French and Spanish i...
متن کاملThe Use of Monolingual Context Vectors for Missing Translations in Cross-Language Information Retrieval
For cross-language text retrieval systems that rely on bilingual dictionaries for bridging the language gap between the source query language and the target document language, good bilingual dictionary coverage is imperative. For terms with missing translations, most systems employ some approaches for expanding the existing translation dictionaries. In this paper, instead of lexicon expansion, ...
متن کاملArabic/English Word Translation Disambiguation using Parallel Corpora and Matching Schemes
The limited coverage of available Arabic language lexicons causes a serious challenge in Arabic cross language information retrieval. Translation in cross language information retrieval consists of assigning one of the semantic representation terms in the target language to the intended query. Despite the problem of the completeness of the dictionary, we also face the problem of which one of th...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Polibits
دوره 54 شماره
صفحات -
تاریخ انتشار 2016